ABSTRACT: 

A method assigns importance ranks to nodes in a linked database, such as any 
database of documents containing citations, the world wide web or any other 
hypermedia database. The rank assigned to a document is calculated from the ranks 
of documents citing it. In addition, the rank of a document is calculated from a 
constant representing the probability that a browser through the database will 
randomly jump to the document. The method is particularly useful in enhancing the 
performance of search engine results for hypermedia databases, such as the world 
wide web, whose documents have a large variation in quality. 
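
As an illustration of the ranking idea in this abstract, the following is a minimal sketch. The abstract does not give a formula, so several choices here are assumptions: the jump probability of 0.15, the splitting of a document's rank evenly among the documents it cites, and the handling of documents that cite nothing. The names `rank_documents`, `links` and `jump` are hypothetical.

```python
def rank_documents(links, jump=0.15, iterations=50):
    """links maps each document to the list of documents it cites."""
    docs = sorted(set(links) | {d for targets in links.values() for d in targets})
    n = len(docs)
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Constant term: probability of randomly jumping to each document.
        new_rank = {d: jump / n for d in docs}
        for d in docs:
            targets = links.get(d, [])
            if targets:
                # Assumption: a citing document splits its rank among its citations.
                share = (1.0 - jump) * rank[d] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a document citing nothing spreads its rank evenly.
                for t in docs:
                    new_rank[t] += (1.0 - jump) * rank[d] / n
        rank = new_rank
    return rank

# Three-document web: A cites B and C, B cites C, C cites A.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = rank_documents(web)
```

In this small example, C is cited by both other documents and ends up ranked above B, which is cited only by A.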

29 Claims, 3 Drawing figures 
ATTY-AGENT-FIRM: Tran; Khanh Q. 



ABSTRACT: 

A system and method are provided for searching for desired items from a network of 
information resources. In particular, the system and method have advantageous 
applicability to searching for World Wide Web pages having desired content. An 
initial set of pages is selected, preferably by running a conventional keyword-based 
query, and then further selecting pages pointing to, or pointed to from, the 
pages found by the keyword-based query. Alternatively, the invention may be applied 
to a single page, where the initial set includes pages pointed to by the single 
page and pages which point to the single page. Then, iteratively, authoritativeness 
values are computed for the pages of the initial set, based on the number of links 
to and from the pages. One or more communities, or "neighborhoods", of related 
pages are defined based on the authoritativeness values thus produced. Such 
communities of pages are likely to be of particular interest and value to the user 
who is interested in the keyword-based query or the single page. 
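
The selection of the initial set described above can be sketched as follows. This is only an illustration: the function name `build_initial_set` and the dictionary representation of links are assumptions, and the keyword-based query is taken as already having produced `root_pages`.

```python
def build_initial_set(root_pages, links):
    """links maps each page to the list of pages it points to."""
    root = set(root_pages)
    expanded = set(root)
    for source, targets in links.items():
        if source in root:
            # Pages pointed to from the pages found by the query.
            expanded.update(targets)
        if root.intersection(targets):
            # Pages pointing to the pages found by the query.
            expanded.add(source)
    return expanded

links = {"p1": ["p2"], "p2": ["p3"], "p4": ["p2"], "p5": ["p6"]}
initial = build_initial_set(["p2"], links)
```

Here `p5` and `p6` are left out because neither links to nor is linked from `p2`.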

57 Claims, 7 Drawing figures 
L10: Entry 3 of 4 File: DWPI Sep 4, 2001 

DERWENT-ACC-NO: 2001-595486 
DERWENT-WEEK: 200467 

COPYRIGHT 2005 DERWENT INFORMATION LTD 

TITLE: Computer-implemented linked document scoring method applicable for analyzing 
linked databases involves processing linked documents based on linked document 
scores assigned according to linking document score 

INVENTOR: PAGE, L 

PRIORITY-DATA: 1997US-035205P (January 10, 1997), 1998US-0004827 (January 9, 1998) 
PATENT-FAMILY: 

PUB-NO          PUB-DATE            LANGUAGE  PAGES  MAIN-IPC 

US 6285999 B1   September 4, 2001             011    G06F017/30 

INT-CL (IPC): G06F 17/30 

ABSTRACTED-PUB-NO: US 6285999B 
BASIC-ABSTRACT: 

NOVELTY - The method involves obtaining a number of hypertext documents (A, B, C) 
consisting of linked documents and linking documents. Each of the linked 
documents is pointed to by a link in one or more of the linking documents. A 
predetermined score is assigned to each of the linked documents based on the scores 
of one or more linking documents. The linked documents are processed according to 
the assigned scores. 

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the following: 
(a) a computer-implemented linked document ranking method; and 

(b) a computer-readable medium which stores processing instructions for linked 
document score determination. 

USE - Applicable for analyzing linked databases. 

ADVANTAGE - Simplifies determination of the importance of a document by counting the 
number of citations or backlinks. Enables ranking of documents in a large linked 
database, even the world wide web. 

DESCRIPTION OF DRAWING(S) - The figure shows a diagram of a three-document web 
illustrating the rank of each document. 

Hypertext documents A, B, C 



4. Document ID: US 6112202 A 

L10: Entry 4 of 4 File: DWPI Aug 29, 2000 

DERWENT-ACC-NO: 2000-663779 
DERWENT-WEEK: 200064 

COPYRIGHT 2005 DERWENT INFORMATION LTD 

TITLE: Computer program product for searching information resources in WWW, directs 
computer system to produce final set of information resources based on produced 
initial and secondary authoritativeness information 

INVENTOR: KLEINBERG, J M 

PRIORITY-DATA: 1997US-0813749 (March 7, 1997) 
PATENT-FAMILY: 

PUB-NO          PUB-DATE          LANGUAGE  PAGES  MAIN-IPC 

US 6112202 A    August 29, 2000             016    G06F017/30 

INT-CL (IPC): G06F 17/30 

ABSTRACTED-PUB-NO: US 6112202A 
BASIC-ABSTRACT: 

NOVELTY - A main controller provided on the recording medium directs the computer 
system to produce a final set of information resources based on produced initial and 
secondary authoritativeness information about a set of information resources 
pointed to by links in resources of the input set. The initial set and the succeeding 
sets of information are iterated until a specific condition is attained. 

DETAILED DESCRIPTION - The controllers provided on the recording medium direct the 
computer system to identify an initial set of information resources, and to define 
initial authoritativeness information for the initial set, respectively. Another 
controller provided on the recording medium directs the computer system to 
generate a final set of information resources, based on the two sets of generated 
authoritativeness information. The information resources include WWW pages and the 
content based links including hyperlinks. INDEPENDENT CLAIMS are also included for 
the following: 

(a) a method of executing a search of information resources; and 

(b) a system for executing a search of information resources. 

USE - For executing searches of information resources in hypertext/hyperlinked 
environments such as the WWW. Also for defining communities of computer network 
users based on received e-mail messages, and for telephone call records. 

ADVANTAGE - Since the existing link structure is objectively observable, the 
evaluation of authoritativeness is automated. 

DESCRIPTION OF DRAWING(S) - The figure shows a flowchart illustrating the search 
execution method of information resources. 
L12: Entry 1 of 1 File: USPT Aug 29, 2000 

DOCUMENT-IDENTIFIER: US 6112202 A 

TITLE: Method and system for identifying authoritative information resources in an 
environment with content-based links between information resources 

Brief Summary Text (47): 

Finally, "neighborhoods" or "communities" of pages are obtained from the resultant 
authoritativeness information. A single neighborhood may be obtained, or several 
distinct neighborhoods may be obtained by partitioning the scores into ranges. 

Detailed Description Text (14): 

The method can be directly extended to produce several, relatively disjoint, 
communities of authorities and hubs. The method of the invention, when practiced in 
such a fashion, serves a clustering function. That is, the more disjoint these 
communities are, the more they are capable of corresponding to intuitive partitions 
of the query topic. The partitions may be made according to various criteria, 
including both semantic distinctions and social "clustering" among creators of 
hyper-links. 

Detailed Description Text (32): 

In step 28, hub and authority vectors H and A are defined, where each term of each 
of the vectors corresponds with one of the pages in the neighborhood. The iterative 
algorithm is to operate on these vectors. 

Detailed Description Text (34): 

The vector H is initialized as follows: 

Detailed Description Text (38): 

The entries in these two vectors are now updated iteratively (step 30). One 
preferred method for performing this updating is given in flowchart form in the 
next several steps of FIG. 1, and depicted graphically in FIGS. 4 and 5. 

Detailed Description Text (39): 

If u and v are pages, let u → v denote the presence of a link from u to v. 
Then the values of the terms of the hub and authority vectors H and A are updated 
as follows: 

(1) A[v] = \sum_{u : u \to v} H[u] 

(2) H[v] = \sum_{u : v \to u} A[u] 

Detailed Description Text (40): 

These two equations are shown respectively as steps 32 and 34 in FIG. 1. Equation 
(1) is illustrated in FIG. 4, in which three pages u1, u2, and u3 have links to a 
page v. The authority vector's term A[v] for the page v is the sum of the hub 
vector values H[u1], H[u2], and H[u3] for the three pages u1, u2, and u3. 

Detailed Description Text (41): 

Similarly, Equation (2) is illustrated in FIG. 5, in which a page v has links to 
three pages u1, u2, and u3. The hub vector's term H[v] for the page v is 
the sum of the authority vector values A[u1], A[u2], and A[u3] for the three pages 
u1, u2, and u3. 
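
The two update rules described for FIGS. 4 and 5 can be sketched in code. This is a minimal illustration, not the patent's implementation: the function name `update_scores` and the dictionary layout are hypothetical, the starting hub values of 1.0 are an assumption (the initialization is not reproduced in this record), and the sketch assumes the hub update uses the freshly updated authority values.

```python
def update_scores(links, hub):
    """One pass of Equations (1) and (2); links maps u to the pages v with u -> v."""
    pages = list(hub)
    # Equation (1): A[v] is the sum of H[u] over the pages u that link to v.
    auth = {p: 0.0 for p in pages}
    for u, targets in links.items():
        for v in targets:
            auth[v] += hub[u]
    # Equation (2): H[v] is the sum of A[u] over the pages u that v links to.
    new_hub = {p: sum(auth[u] for u in links.get(p, [])) for p in pages}
    return auth, new_hub

# The FIG. 4 situation: three pages u1, u2, u3 all link to page v.
links = {"u1": ["v"], "u2": ["v"], "u3": ["v"]}
hub = {p: 1.0 for p in ("u1", "u2", "u3", "v")}  # assumed starting values
auth, hub = update_scores(links, hub)
```

With these inputs, A[v] becomes H[u1] + H[u2] + H[u3] = 3.0, and each of u1, u2, u3 then receives a hub value of 3.0 because its only link points to v.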


Detailed Description Text (42): 

It will be seen that, as these iterations are performed, the values of the terms of 
the hub and authority vectors will increase. Accordingly, the vectors are 
preferably normalized, to prevent the numerical values from growing too large (step 
36). One preferred normalization method is the following: ##EQU2## 
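
The preferred normalization formula is not reproduced in this record; only its placeholder survives. As a hedged illustration of what a normalization step could look like, the sketch below scales a vector so that the sum of the squares of its terms is 1. That particular choice is an assumption, not taken from the source, and the name `normalize` is hypothetical.

```python
import math

def normalize(vector):
    """Scale the terms so their squares sum to 1 (assumed normalization)."""
    norm = math.sqrt(sum(x * x for x in vector.values()))
    if norm == 0.0:
        # Nothing to scale; return a copy unchanged.
        return dict(vector)
    return {page: x / norm for page, x in vector.items()}

scaled = normalize({"u1": 3.0, "u2": 4.0})
```

Any scaling that keeps the relative sizes of the terms would serve the stated purpose of preventing the values from growing too large.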

Detailed Description Text (44): 

It will be seen that, as the successive iterations proceed, the hub and authority 
vector values will increase based on the number of links common to the page 
populations. The pages unrelated to the desired subject matter, which will have 
relatively few links to the pages related to the desired subject matter, will have 
relatively low values, and will, in effect, be "weeded out." 

Detailed Description Text (45): 

When the iterations have been completed, FIG. 1 concludes by outputting its final 
results. A preferred output technique, given in steps 38 and 40, is to scan the hub 
and authority vectors H and A to find the k largest terms, k having been specified 
in step 2, and being presumptively smaller than the number of pages identified. 
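
Scanning a vector for its k largest terms can be sketched with the standard library. The function name `top_k` and the sample values are hypothetical.

```python
import heapq

def top_k(vector, k):
    """Return the k pages whose terms in the vector are largest."""
    return heapq.nlargest(k, vector, key=vector.get)

authority = {"a": 0.1, "b": 0.7, "c": 0.4}
best = top_k(authority, 2)
```

The same call applied to the hub vector H would give the k strongest hubs.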

Detailed Description Text (48): 

The above-described method may be extended to locate several communities of 
authorities and hubs. Iterations are performed in essentially the same manner as 
described above, but now, several vectors of each type are maintained. For 
instance, if there are to be q hub vectors and q authority vectors, representing q 
distinct neighborhoods, then the hub and authority vectors are 
distinguished by index subscripts, as follows: A_0, ..., A_q and 
H_0, ..., H_q. 

Detailed Description Text (50): 

Initially, the implementation of FIG. 6 chooses the additional input parameter q, a 
number of neighborhoods (i.e., of hub and authority vectors) to be found (step 42). 
In step 44, initial values for the terms of the hub and authority vectors are set. 

Detailed Description Text (51): 

However, the initialization is preferably performed in a different manner from what 
was done in step 28 of FIG. 1. The objective of this embodiment is to come up with 
distinct neighborhoods. Consequently, it is necessary that the final result of the 
iterations be multiple distinct vectors. In order for the iterations to converge to 
multiple distinct vectors, it is necessary that no two of the vectors become equal 
during the course of the iterations. 

Detailed Description Text (52): 

For this purpose, the vectors are initialized so as to be orthogonal. Moreover, 
following each iteration, they are again updated so as to remain orthogonal. This 
updating step can be accomplished by the standard Gram-Schmidt procedure, as given 
in G. Golub, C.F. Van Loan, "Matrix Computations", Johns Hopkins University Press, 
1989. 
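
For readers unfamiliar with the procedure, a Gram-Schmidt orthogonalization can be sketched as below. This is the numerically steadier "modified" variant, and it also scales each vector to unit length; both choices are assumptions for illustration, as is the name `gram_schmidt`.

```python
def gram_schmidt(vectors):
    """Return mutually orthogonal unit vectors spanning the same space."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Subtract the component of w that lies along b.
            dot = sum(x * y for x, y in zip(w, b))
            w = [x - dot * y for x, y in zip(w, b)]
        norm = sum(x * x for x in w) ** 0.5
        if norm > 1e-12:
            # Keep only vectors that are not (numerically) zero.
            basis.append([x / norm for x in w])
    return basis

ortho = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
```

Because of the subtractions, the second output vector contains a negative term even though both inputs are non-negative.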

Detailed Description Text (53): 

In light of the foregoing, the preferred embodiment of the invention is as follows: 
Before the iterations begin, in step 46 the hub vectors are orthogonalized. The 
initial orthogonalization may conveniently be performed by assigning each 
coordinate a real-number value chosen uniformly at random from the interval [0,1]. 

Detailed Description Text (54): 

The iterations are now performed (step 48). For a given iteration, the summations, 
similar to those given above in Equations (1) and (2), are done separately over each 
pair of hub and authority vectors (A_i, H_i). 

Detailed Description Text (55): 

At the end of each iteration, the vectors are modified to be mutually orthogonal. 
This can be accomplished by the standard Gram-Schmidt procedure given in G. Golub 
(supra). 

Detailed Description Text (56): 

A preferred sequence of the steps of an iteration is given in FIG. 6, as follows: 
In step 50, the authority vectors are updated. When they are all updated, they are 
then orthogonalized (step 52). Then, in step 54, the hub vectors are updated. When 
they are all updated, they are then orthogonalized (step 56). This completes an 
iteration. The iteration is repeated a desired number of times. 
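
The sequence of steps 46 through 56 can be sketched end to end. This is an illustration under stated assumptions rather than the patented implementation: links are held in a 0/1 adjacency matrix `adj`, the parameter `q` here simply counts the vector pairs kept, the orthogonalization also rescales each vector to unit length, and the names `orthonormalize` and `iterate_communities` are hypothetical.

```python
import random

def orthonormalize(vectors):
    """Gram-Schmidt: make the vectors mutually orthogonal with unit length."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(x * y for x, y in zip(w, b))
            w = [x - dot * y for x, y in zip(w, b)]
        norm = sum(x * x for x in w) ** 0.5
        if norm > 1e-12:
            basis.append([x / norm for x in w])
    return basis

def iterate_communities(adj, q, iterations=20, seed=0):
    """adj[u][v] is 1 when page u links to page v, else 0."""
    n = len(adj)
    rng = random.Random(seed)
    # Step 46: random start in [0, 1], then orthogonalize the hub vectors.
    hubs = orthonormalize([[rng.random() for _ in range(n)] for _ in range(q)])
    auths = []
    for _ in range(iterations):
        # Steps 50 and 52: update every authority vector, then orthogonalize.
        auths = orthonormalize(
            [[sum(adj[u][v] * h[u] for u in range(n)) for v in range(n)] for h in hubs])
        # Steps 54 and 56: update every hub vector, then orthogonalize.
        hubs = orthonormalize(
            [[sum(adj[v][u] * a[u] for u in range(n)) for v in range(n)] for a in auths])
    return auths, hubs

# Pages 0-2 all link to page 3; pages 4-5 both link to page 6.
n = 7
adj = [[0] * n for _ in range(n)]
for u, v in [(0, 3), (1, 3), (2, 3), (4, 6), (5, 6)]:
    adj[u][v] = 1
auths, hubs = iterate_communities(adj, q=2)
```

In this toy graph the first authority vector concentrates on page 3, which has the most incoming links, and the second, kept orthogonal to the first, concentrates on page 6.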

Detailed Description Text (57): 

As with the embodiment of FIG. 1, the largest (positive) entries of A_0 and 
H_0 are returned as the primary hubs and authorities. One can then define 2q 
additional authority/hub communities, by taking the q most positive and the q most 
negative entries from each of the pairs of vectors (A_i, H_i), for 
i = 1, ..., q. 

Detailed Description Text (58): 

Note that the Gram-Schmidt procedure, which includes subtractions, can produce 
negative values for vector terms. The positivity or negativity of the entries does 
not have a direct meaning in the context of the method. Rather, a more significant 
meaning is attributed to the magnitudes, i.e., absolute values, of the terms. In 
general, the more links to or from a page, or, more broadly, the greater the 
authoritativeness of the page as to the desired subject matter, the greater the 
magnitude of the value will be. 

Detailed Description Text (59): 

The noteworthy property of the entries, taken as a group, is simply that they may 
be partitioned into two or more communities, based on their ranges of values. It 
may be convenient or desirable, where one set is positive and the other set is 
negative, to partition at the zero value. However, it is not crucial that the 
partitions be evenly distributed or symmetric. More generally, any subset of the 
communities can be returned, possibly according to additional criteria imposed by 
the user on the set of pages. 

Detailed Description Text (60): 

For discussion purposes, however, an example will be given in which partitioning is 
to be symmetric about the zero point. 

Detailed Description Text (64): 

In step 64, a community, indexed as community 2i-1 (for 1 ≤ i ≤ q), is defined, by 
choosing k pages with largest coordinates in the vector H[i] as hubs (step 66), and 
choosing k pages with largest coordinates in the vector A[i] as authorities. 

Detailed Description Text (65): 

Next, in step 70, a community, indexed as community 2i (for 1 ≤ i ≤ q), is defined, by 
choosing k pages with smallest coordinates in the vector H[i] as hubs (step 66), 
and choosing k pages with smallest coordinates in the vector A[i] as authorities. 
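
The symmetric partition about zero can be sketched as follows. The sketch assumes that each community pairs hubs taken from H[i] with authorities taken from A[i], as the hub/authority framing elsewhere in the text suggests; that pairing, the function name `split_communities`, and the sample values are all assumptions for illustration.

```python
def split_communities(hub_vec, auth_vec, k):
    """Return the 'largest coordinates' and 'smallest coordinates' communities."""
    pages = range(len(hub_vec))

    def pick(vec, largest):
        # Page indices ordered by coordinate, keeping the first k.
        return sorted(pages, key=lambda p: vec[p], reverse=largest)[:k]

    community_odd = {"hubs": pick(hub_vec, True), "authorities": pick(auth_vec, True)}
    community_even = {"hubs": pick(hub_vec, False), "authorities": pick(auth_vec, False)}
    return community_odd, community_even

hub_i = [0.5, -0.4, 0.1, -0.6]
auth_i = [-0.2, 0.7, 0.3, -0.5]
first, second = split_communities(hub_i, auth_i, k=2)
```

Here `first` plays the role of community 2i-1 and `second` the role of community 2i for one vector pair.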

Detailed Description Text (69): 

The hub and authority vectors H and A correspond to the principal eigenvectors of 
two matrices associated with the set of pages. 

Detailed Description Text (72): 

In particular, the authority vector A is the principal eigenvector of M, and the 
hub vector H is the principal eigenvector of N. 
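
The definitions of M and N are not reproduced in this record. As a hedged sketch of why an eigenvector relationship of this kind arises, assume G is the adjacency matrix of the page set, with G_{uv} = 1 when u → v and 0 otherwise (G is a symbol introduced here for illustration, not taken from the source). The two update rules described for Equations (1) and (2) then read as matrix products, and substituting one into the other gives:

```latex
A = G^{\mathsf{T}} H, \qquad H = G A
\;\Longrightarrow\;
A \leftarrow (G^{\mathsf{T}} G)\, A, \qquad H \leftarrow (G\, G^{\mathsf{T}})\, H
```

Under this assumption, G^T G and G G^T would be natural candidates for M and N respectively: repeated updating with normalization is then power iteration, which tends toward a principal eigenvector when the starting vector has a component along it.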

Detailed Description Text (73): 